Fix NonDaemonThreads flush race causing MessageData loss on Java 21 OpenJ9#4759

Draft
Copilot wants to merge 2 commits into main from copilot/fix-nondaemonthreadstest-failure

Copilot AI commented Jun 18, 2026

NonDaemonThreadsTest$Java21OpenJ9Test was timing out waiting for MessageData because the "done" log record was never exported before the process exited.

Root cause

SecondEntryPoint.flushAll() chained telemetryClient.forceFlush() inside an async whenComplete() callback on sdk.shutdown(). On Java 21 OpenJ9, the OTel SDK's BatchLogRecordProcessor daemon thread could stall during shutdown, leaving the result code unresolved. With no callback ever firing, telemetryClient.forceFlush() was never called. The 10-second join() timeout expired, the shutdown hook thread exited, daemon threads were killed, and the buffered log was lost.
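The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: it uses `java.util.concurrent.CompletableFuture` as a stand-in for the SDK's result code, and the class and method names (`StalledCallbackDemo`, `flushRanWithChaining`) are hypothetical. It shows that a flush reachable only through a completion callback never runs when the upstream operation stalls.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class StalledCallbackDemo {

    // Returns true if the chained "flush" ran before the bounded wait expired.
    // sdkShutdown is a stand-in (assumption) for the result of sdk.shutdown().
    static boolean flushRanWithChaining(CompletableFuture<Void> sdkShutdown, long waitMillis)
            throws Exception {
        AtomicBoolean flushed = new AtomicBoolean(false);
        // The flush is only reachable through the shutdown callback.
        CompletableFuture<Void> chained = sdkShutdown.thenRun(() -> flushed.set(true));
        try {
            chained.get(waitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Shutdown never resolved, so the callback never fired.
        }
        return flushed.get();
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<Void> stalled = new CompletableFuture<>(); // never completed
        System.out.println("stalled shutdown, flush ran: " + flushRanWithChaining(stalled, 100));

        CompletableFuture<Void> healthy = CompletableFuture.completedFuture(null);
        System.out.println("healthy shutdown, flush ran: " + flushRanWithChaining(healthy, 100));
    }
}
```

In the stalled case the outer wait simply times out, which mirrors the shutdown hook's `join()` expiring with the log record still buffered.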

A secondary issue in BatchItemProcessor: if the daemon worker was interrupted during signal.poll(), it exited silently and left flushRequested unresolved, so forceFlush() callers blocked until the outer timeout.
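One way to picture the interrupt handling is the sketch below. It is an illustration under assumptions, not the actual BatchItemProcessor: the class `InterruptAwareWorker`, its queue, and the use of `CompletableFuture<Boolean>` for the pending flush are all hypothetical stand-ins. The point is the `catch` block, which resolves any pending flush as failed before the worker exits.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class InterruptAwareWorker implements Runnable {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
    private final AtomicInteger exported = new AtomicInteger();
    // Result of an in-flight flush request, or null if none is pending (hypothetical field).
    private final AtomicReference<CompletableFuture<Boolean>> flushRequested =
            new AtomicReference<>();

    // Registers a flush request, reusing an already pending one if present.
    CompletableFuture<Boolean> forceFlush() {
        return flushRequested.updateAndGet(
                cur -> cur != null ? cur : new CompletableFuture<Boolean>());
    }

    @Override
    public void run() {
        try {
            while (true) {
                String item = queue.poll(1, TimeUnit.SECONDS);
                if (item != null) {
                    exported.incrementAndGet(); // real export work omitted
                }
                if (queue.isEmpty()) {
                    CompletableFuture<Boolean> pending = flushRequested.getAndSet(null);
                    if (pending != null) {
                        pending.complete(true);
                    }
                }
            }
        } catch (InterruptedException e) {
            // Fail the pending flush so callers are unblocked instead of hanging.
            CompletableFuture<Boolean> pending = flushRequested.getAndSet(null);
            if (pending != null) {
                pending.complete(false);
            }
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
    }

    public static void main(String[] args) {
        InterruptAwareWorker worker = new InterruptAwareWorker();
        CompletableFuture<Boolean> flush = worker.forceFlush();
        // Run the loop inline with the interrupt flag already set, so poll() throws at once.
        Thread.currentThread().interrupt();
        worker.run();
        Thread.interrupted(); // clear the flag restored by the handler
        System.out.println("flush resolved: " + flush.isDone()
                + ", success: " + flush.getNow(true));
    }
}
```

Without the `catch` block's completion step, the future returned by `forceFlush()` would stay unresolved after the worker exits, which is the hang described above.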

Changes

  • SecondEntryPoint.flushAll() — replace the async callback chain with a synchronous bounded wait + unconditional flush:

    // Before: telemetryClient.forceFlush() only called if sdk.shutdown() completes
    sdk.shutdown().whenComplete(() -> telemetryClient.forceFlush());
    
    // After: always flush, even if SDK shutdown stalls
    sdk.shutdown().join(5, TimeUnit.SECONDS);
    return telemetryClient.forceFlush();
  • BatchItemProcessor worker InterruptedException handler — fail any pending flushRequested result code before returning, so callers are unblocked immediately rather than hanging until the outer timeout.
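The first change can be sketched generically as "wait for shutdown with a bound, then flush regardless of the outcome". The example below is a simplified illustration, assuming `CompletableFuture` in place of the SDK's result code; `BoundedShutdownThenFlush` and `shutdownThenFlush` are hypothetical names, and the `Supplier` stands in for `telemetryClient.forceFlush()`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class BoundedShutdownThenFlush {

    // Waits up to the given timeout for shutdown, then always runs the flush.
    static <T> T shutdownThenFlush(
            CompletableFuture<?> sdkShutdown, long timeout, TimeUnit unit, Supplier<T> flush) {
        try {
            sdkShutdown.get(timeout, unit);
        } catch (TimeoutException | ExecutionException e) {
            // Shutdown stalled or failed: fall through and flush anyway.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status, still flush
        }
        return flush.get();
    }

    public static void main(String[] args) {
        CompletableFuture<Void> stalled = new CompletableFuture<>(); // never completes
        String result = shutdownThenFlush(
                stalled, 100, TimeUnit.MILLISECONDS, () -> "flushed");
        System.out.println(result);
    }
}
```

The trade-off is that the flush may start while shutdown is still in progress, so the bound should leave enough of the shutdown hook's overall budget for the flush itself.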

…tdown() and InterruptedException handling

Co-authored-by: johnoliver <1615532+johnoliver@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix failing GitHub Actions job NonDaemonThreads:NonDaemonThreadsTest" to "Fix NonDaemonThreads flush race causing MessageData loss on Java 21 OpenJ9" Jun 18, 2026
Copilot AI requested a review from johnoliver June 18, 2026 12:13

2 participants